An exactly solvable model for a β-hairpin with
random interactions



Marco Zamparo 

Dipartimento di Fisica, INFN sezione di Torino and CNISM, 
Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, Italy 

E-mail: marco.zamparo@polito.it 

Abstract. I investigate a disordered version of a simplified model of protein folding,
with binary degrees of freedom, applied to an ideal β-hairpin structure. Disorder
is introduced by assuming that the contact energies are independent and identically
distributed random variables. The equilibrium free-energy of the model is studied:
its quenched value is calculated exactly and its self-averaging property is proved.
1. Introduction

The present paper is devoted to the analysis of a simple disordered model for an ideal
β-hairpin structure, for which some exact results may be derived. Disordered models
give rise to very intricate scenarios and their study requires new mathematical methods and
algorithms; referring to plain models with a known solution can be helpful to test
them.

The model I consider is a disordered version of one introduced by Wako and Saito
[1, 2] in 1978 and independently reintroduced by Muñoz and co-workers [3, 4, 5] in the
late 1990s to investigate the problem of protein folding. The Wako-Saito-Muñoz-Eaton
(WSME) model is a highly simplified one whose purpose is to describe the equilibrium
of the protein folding process under the assumption that it is mainly determined by
the structure of the native state (the functional state of a protein), whose knowledge is
assumed. It is a one-dimensional model, with long-range, many-body interactions, where
a binary variable is associated with each peptide bond (the bond connecting consecutive
amino acids), distinguishing the native from the unfolded conformation. Two amino acids can
interact only if they are in contact in the native state and all the peptide bonds between
them are ordered. Moreover, an entropic cost is associated with each ordered bond.

Many papers have been published in the last few years concerning the equilibrium
properties of the model and its exact solution [6, 7], its kinetics [8, 9, 10, 11] and some
generalizations to the problem of mechanical unfolding [12, 13]. In particular, in [6] the
exact solution for a homogeneous β-hairpin structure was given, while in [7] one can
find the exact treatment in the general case. Recently the model has been applied to
the analysis of real proteins [14, 15, 16, 17, 18, 19, 20, 21] and, rather interestingly, to
a problem of strained epitaxy [22, 23, 24].

In order to introduce some disorder in the WSME model, I suppose the contact
energies are independent quenched random variables. This assumption has been made for the
base pairing energies in some models of the ribonucleic acid (RNA) secondary structure
[25], where one aims at retaining the spirit of Watson-Crick pairing, namely that interactions
between some specific bases are favoured with respect to the others. However, even
if the β-hairpin structure mimics the zipper features of the RNA secondary structure,
the purpose of this paper is the modest one of proposing a simple exactly solvable
disordered model, calculating its free-energy and proving its self-averaging property.
The computation of the quenched free-energy, i.e. the average of the free-energy over
the quenched disorder, will be carried out without resorting to replica theory [26], making use
instead of some properties of the free-energy itself which will be proven rigorously in advance.

The paper is organised as follows: in Section 2 the WSME model and its disordered
version for the β-hairpin structure are introduced. Section 3 is devoted to the calculation
of the quenched free-energy and Section 4 to the proof of self-averaging. Conclusions are drawn
in Section 5.
2. The model

The WSME model describes a protein of $N+1$ residues as a chain of $N$ peptide
bonds connecting consecutive amino acids. In order to identify the native (ordered)
conformation and distinguish it from the unfolded (disordered) one, a binary variable
$m_k$ is associated with the peptide bond $k$, $k = 1, \ldots, N$. Each variable, related to the values
of the dihedral angles at the same peptide bond, takes the value 1 in the native state
and 0 otherwise. Since the unfolded state allows a much larger number of microscopic
realizations than the native one, an entropic cost $q_k$ is assigned to the ordering of the peptide
bond $k$. The main assumption about the interactions is that two bonds can interact
only if they are in contact in the native state (so that the model can be classified as
Go-like [27]) and all bonds between them are ordered.

The Hamiltonian of the model (an effective free-energy, properly speaking) reads

$$H_N(m) = \sum_{i=1}^{N-1}\sum_{j=i+1}^{N} \epsilon_{ij}\,\Delta_{ij} \prod_{k=i}^{j} m_k + k_B T \sum_{k=1}^{N} q_k m_k, \tag{1}$$

where $T$ is the absolute temperature. The product $\prod_{k=i}^{j} m_k$ takes value 1 if and only
if all the peptide bonds going from $i$ to $j$ are ordered, thereby realizing the assumed
interaction. The contact matrix elements $\Delta_{ij} \in \{0, 1\}$ tell us which bonds are at close
distance in the native state. Finally, the contact energies $\epsilon_{ij} < 0$ quantify the intensity
of the contacts.

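An ideal β-hairpin with an odd number $2N+1$ of peptide bonds is characterized
by the contact matrix elements $\Delta_{ij}$ equal to 1 if $i + j = 2N + 2$ and 0 otherwise. The
structure results in the characteristic Hamiltonian (divided by $k_B T$)

$$H_N^{\epsilon}(m) = -\beta \sum_{i=1}^{N} \epsilon_i \prod_{k=N+1-i}^{N+1+i} m_k + q \sum_{k=1}^{2N+1} m_k, \tag{2}$$

where $\beta = 1/k_B T$. With the sign convention of (2), a positive $\epsilon_i$ corresponds to an
attractive contact between the bonds $N+1-i$ and $N+1+i$.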

In this work I concentrate on the case in which $\epsilon_1, \ldots, \epsilon_N$ are independent random
variables identically distributed in a set $\mathcal{E} \subset \mathbb{R}$ according to a probability measure $P$.
Moreover, in order to deal with a homogeneous model having a thermodynamic limit,
the entropic cost $q_k$ is chosen equal to $q$ for any $k$, as the comparison between the
Hamiltonians (1) and (2) shows. I shall assume $P$ is any probability measure satisfying
the condition $\int_{\epsilon\in\mathcal{E}} \exp(\beta\epsilon)\,P(d\epsilon) < \infty$ for every real value of $\beta$, and from now
on I will denote by $\mu$ the expectation of the contact energy and by $P_N$ the $N$-fold product
measure $P \times \ldots \times P$.

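Let us denote by $f_N$ the quenched free-energy (times $\beta$)

$$f_N(\beta,q) = -\frac{1}{2N+1}\,\mathbb{E}[\log Z_N] = -\frac{1}{2N+1}\int_{\epsilon\in\mathcal{E}^N} \log Z_N(\epsilon)\,P_N(d\epsilon), \tag{3}$$

where $Z_N(\epsilon)$ is the partition function of the model (2) given a sequence $\epsilon = (\epsilon_1, \ldots, \epsilon_N)$
of interaction energies:

$$Z_N(\epsilon) = \sum_{m\in\{0,1\}^{2N+1}} \exp\big[-H_N^{\epsilon}(m)\big]. \tag{4}$$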
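The definitions (2) and (4) can be made concrete with a brute-force enumeration, feasible only for small $N$ since the number of configurations grows as $2^{2N+1}$. The following is a minimal sketch; the function names are hypothetical and chosen only for illustration.

```python
import itertools
import math


def hairpin_hamiltonian(m, eps, beta, q):
    """Dimensionless Hamiltonian (2); m[k-1] is the state of peptide bond k."""
    N = len(eps)
    value = q * sum(m)  # entropic cost of the ordered bonds
    for i in range(1, N + 1):
        # Contact i contributes only if bonds N+1-i, ..., N+1+i are all ordered.
        if all(m[k - 1] == 1 for k in range(N + 1 - i, N + 2 + i)):
            value -= beta * eps[i - 1]
    return value


def partition_function_brute_force(eps, beta, q):
    """Partition function (4), summing over all 2**(2N+1) configurations."""
    N = len(eps)
    return sum(
        math.exp(-hairpin_hamiltonian(m, eps, beta, q))
        for m in itertools.product((0, 1), repeat=2 * N + 1)
    )


if __name__ == "__main__":
    print(partition_function_brute_force([0.8, -0.3, 1.1], beta=1.0, q=0.5))
```

For $N = 1$ the enumeration involves only eight configurations and can be checked by hand.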
3. The free-energy

In this section I show how to compute exactly the quenched free energy, discussing some
of its properties in advance and then exploiting them to perform the calculation. Let us
start by observing that, due to the features of the model, it is possible to simplify the
expression of the partition function $Z_N$. Indeed, summing over the binary variables $m_1$
and $m_{2N+1}$ yields the iterative equation [6]

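$$Z_N(\epsilon) = (1+e^{-q})^2\, Z_{N-1}(\epsilon) + (e^{\beta\epsilon_N}-1)\, e^{\beta\sum_{i=1}^{N-1}\epsilon_i - q(2N+1)}, \tag{5}$$

valid for any integer $N \ge 2$. Joining this relation to the initial condition

$$Z_1(\epsilon) = (1+e^{-q})^3 + (e^{\beta\epsilon_1}-1)\, e^{-3q}, \tag{6}$$

one obtains immediately the expression

$$Z_N(\epsilon) = (1+e^{-q})^{2N+1} + \sum_{n=1}^{N} (e^{\beta\epsilon_n}-1)\, e^{\beta\sum_{i=1}^{n-1}\epsilon_i - q(2n+1)}\, (1+e^{-q})^{2(N-n)}. \tag{7}$$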
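As a consistency check, the closed form (7) can be compared numerically with the iteration of (5) started from (6). The sketch below assumes nothing beyond those three formulas; the function names are hypothetical.

```python
import math


def z_closed_form(eps, beta, q):
    """Closed-form expression (7) for eps = (eps_1, ..., eps_N)."""
    N = len(eps)
    a = 1.0 + math.exp(-q)
    z = a ** (2 * N + 1)
    partial_sum = 0.0  # eps_1 + ... + eps_{n-1}
    for n in range(1, N + 1):
        z += ((math.exp(beta * eps[n - 1]) - 1.0)
              * math.exp(beta * partial_sum - q * (2 * n + 1))
              * a ** (2 * (N - n)))
        partial_sum += eps[n - 1]
    return z


def z_recursive(eps, beta, q):
    """Iterate the recursion (5) starting from the initial condition (6)."""
    a = 1.0 + math.exp(-q)
    z = a ** 3 + (math.exp(beta * eps[0]) - 1.0) * math.exp(-3.0 * q)
    partial_sum = eps[0]  # eps_1 + ... + eps_{n-1} for the next value of n
    for n in range(2, len(eps) + 1):
        z = a ** 2 * z + ((math.exp(beta * eps[n - 1]) - 1.0)
                          * math.exp(beta * partial_sum - q * (2 * n + 1)))
        partial_sum += eps[n - 1]
    return z


if __name__ == "__main__":
    energies = [0.8, -0.3, 1.1, 0.2]
    print(z_closed_form(energies, 1.0, 0.5), z_recursive(energies, 1.0, 0.5))
```

Both routines cost $O(N)$ operations, in contrast with the exponential cost of summing over all configurations in (4).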



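The formula for $Z_N$ can be simplified further, as stated by the following
proposition.

Proposition 1. There exist two positive constants $C$ and $D$, independent of $N$, such
that

$$C\left[1 + \sum_{n=1}^{N} \frac{e^{\beta\sum_{i=1}^{n}\epsilon_i}}{(1+e^{q})^{2n}}\right] \le \frac{Z_N(\epsilon)}{(1+e^{-q})^{2N+1}} \le D\left[1 + \sum_{n=1}^{N} \frac{e^{\beta\sum_{i=1}^{n}\epsilon_i}}{(1+e^{q})^{2n}}\right]. \tag{8}$$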

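Before sketching the proof, in order to deal with more compact formulas in the
following, it is convenient to introduce the new quantities

$$\Xi_N^{\beta,\lambda}(\epsilon) = 1 + \sum_{n=1}^{N} e^{\beta\sum_{i=1}^{n}\epsilon_i - \lambda n} \tag{9}$$

and

$$g_N(\beta,\lambda) = \frac{1}{N}\,\mathbb{E}\big[\log \Xi_N^{\beta,\lambda}\big], \tag{10}$$

where the explicit dependence on $\beta$ and $\lambda$ is taken into account, and rewrite
$f = \lim_{N\to\infty} f_N$ in the form

$$f(\beta,q) = -\log(1+e^{-q}) - \frac{1}{2}\, g\big(\beta, 2\log(1+e^{q})\big) \tag{11}$$

with $g = \lim_{N\to\infty} g_N$. Indeed, the bracketed sums in (8) coincide with $\Xi_N^{\beta,\lambda}(\epsilon)$ for
$\lambda = 2\log(1+e^{q})$, and the constants $C$ and $D$ do not contribute in the thermodynamic limit.
The relationship between the free-energy and the model
parameters comes from the evaluation of the function $g$, so that I shall focus on $g$
rather than $f$.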
Proof of Proposition 1. Looking at the expression (7) and splitting the term $(e^{\beta\epsilon_n} - 1)$
in the sum, it is possible to rewrite it in the following manner:

$$\frac{Z_N(\epsilon)}{(1+e^{-q})^{2N+1}} = 1 - (1+e^{q})^{-3} + \frac{1-(1+e^{q})^{-2}}{1+e^{q}} \sum_{n=1}^{N-1} \frac{e^{\beta\sum_{i=1}^{n}\epsilon_i}}{(1+e^{q})^{2n}} + \frac{1}{1+e^{q}}\,\frac{e^{\beta\sum_{i=1}^{N}\epsilon_i}}{(1+e^{q})^{2N}}. \tag{12}$$

The statement of the proposition is achieved by choosing

$$C = \min\left\{1-(1+e^{q})^{-3},\ \frac{1-(1+e^{q})^{-2}}{1+e^{q}},\ (1+e^{q})^{-1}\right\} > 0 \tag{13}$$

and

$$D = \max\left\{1-(1+e^{q})^{-3},\ \frac{1-(1+e^{q})^{-2}}{1+e^{q}},\ (1+e^{q})^{-1}\right\} > 0. \tag{14}$$
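The two-sided bound of Proposition 1 can be probed numerically on a random sequence of energies. The sketch below assumes Gaussian contact energies purely for illustration (any distribution would do) and uses hypothetical function names; the constants are the smallest and largest coefficients appearing in (12).

```python
import math
import random


def normalized_partition_function(eps, beta, q):
    """Z_N / (1 + e^{-q})^{2N+1} from (7), using e^{-q}/(1+e^{-q}) = 1/(1+e^{q})."""
    b = 1.0 + math.exp(q)
    total, partial_sum = 1.0, 0.0
    for n, e in enumerate(eps, start=1):
        total += (math.exp(beta * e) - 1.0) * math.exp(beta * partial_sum) / b ** (2 * n + 1)
        partial_sum += e
    return total


def reference_sum(eps, beta, q):
    """1 + sum_n exp(beta * (eps_1 + ... + eps_n)) / (1 + e^{q})^{2n}."""
    b = 1.0 + math.exp(q)
    total, partial_sum = 1.0, 0.0
    for n, e in enumerate(eps, start=1):
        partial_sum += e
        total += math.exp(beta * partial_sum) / b ** (2 * n)
    return total


def proposition_1_constants(q):
    """Smallest and largest coefficient of the decomposition (12)."""
    b = 1.0 + math.exp(q)
    coefficients = (1.0 - b ** -3, (1.0 - b ** -2) / b, 1.0 / b)
    return min(coefficients), max(coefficients)


if __name__ == "__main__":
    rng = random.Random(0)
    energies = [rng.gauss(0.5, 1.0) for _ in range(30)]  # assumed Gaussian disorder
    c, d = proposition_1_constants(0.3)
    ratio = normalized_partition_function(energies, 1.2, 0.3)
    ref = reference_sum(energies, 1.2, 0.3)
    print(c * ref <= ratio <= d * ref)
```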

Let us now go over the properties of $g$ that allow its evaluation. From a
physical point of view one is interested only in positive values of $\beta$ and $\lambda$, but for
analytical reasons it is convenient to let $\beta$ and $\lambda$ take any real value. The first
property I show concerns the behaviour of $g$ under reflection with respect to the origin.
Proposition 2. $g(\beta,\lambda) = \beta\mu - \lambda + g(-\beta,-\lambda)$, where $\mu$ is the expectation value of the
contact energy:

$$\mu = \int_{\epsilon\in\mathcal{E}} \epsilon\, P(d\epsilon). \tag{15}$$



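Proof of Proposition 2. Remembering the definition (9), we have

$$\begin{aligned}
\Xi_N^{\beta,\lambda}(\epsilon_1,\ldots,\epsilon_N) &= 1 + \sum_{n=1}^{N-1} e^{\beta\sum_{i=1}^{n}\epsilon_i - \lambda n} + e^{\beta\sum_{i=1}^{N}\epsilon_i - \lambda N} \\
&= e^{\beta\sum_{i=1}^{N}\epsilon_i - \lambda N}\left[1 + \sum_{n=1}^{N-1} e^{-\beta\sum_{i=n+1}^{N}\epsilon_i + \lambda(N-n)} + e^{-\beta\sum_{i=1}^{N}\epsilon_i + \lambda N}\right]
\end{aligned} \tag{16}$$

and changing $n$ with $N - n$ in the sum, we can go on writing

$$\begin{aligned}
\Xi_N^{\beta,\lambda}(\epsilon_1,\ldots,\epsilon_N) &= e^{\beta\sum_{i=1}^{N}\epsilon_i - \lambda N}\left[1 + \sum_{n=1}^{N-1} e^{-\beta\sum_{i=N-n+1}^{N}\epsilon_i + \lambda n} + e^{-\beta\sum_{i=1}^{N}\epsilon_i + \lambda N}\right] \\
&= e^{\beta\sum_{i=1}^{N}\epsilon_i - \lambda N}\left[1 + \sum_{n=1}^{N} e^{-\beta\sum_{i=1}^{n}\epsilon_{N-i+1} + \lambda n}\right] \\
&= e^{\beta\sum_{i=1}^{N}\epsilon_i - \lambda N}\; \Xi_N^{-\beta,-\lambda}(\epsilon_N,\ldots,\epsilon_1).
\end{aligned} \tag{17}$$

Since the energies are independent and identically distributed, the reversed sequence
$(\epsilon_N,\ldots,\epsilon_1)$ has the same law as $(\epsilon_1,\ldots,\epsilon_N)$, so that the connection (10) between
$\Xi_N$ and $g_N$ allows us to conclude the proof immediately.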
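The reflection identity used in this proof, namely that $\Xi_N^{\beta,\lambda}$ evaluated on a sequence equals $e^{\beta\sum_i \epsilon_i - \lambda N}$ times $\Xi_N^{-\beta,-\lambda}$ evaluated on the reversed sequence, holds for every single realization of the energies and can be checked directly. A minimal sketch with hypothetical function names and an arbitrary test sequence:

```python
import math
import random


def xi(eps, beta, lam):
    """Xi_N^{beta,lambda}(eps) = 1 + sum_n exp(beta * (eps_1+...+eps_n) - lam * n)."""
    total, partial_sum = 1.0, 0.0
    for n, e in enumerate(eps, start=1):
        partial_sum += e
        total += math.exp(beta * partial_sum - lam * n)
    return total


def xi_reflected(eps, beta, lam):
    """Prefactor times Xi at (-beta, -lam) on the reversed sequence."""
    prefactor = math.exp(beta * sum(eps) - lam * len(eps))
    return prefactor * xi(list(reversed(eps)), -beta, -lam)


if __name__ == "__main__":
    rng = random.Random(1)
    energies = [rng.gauss(0.2, 1.0) for _ in range(12)]  # arbitrary test sequence
    print(xi(energies, 0.9, 0.4), xi_reflected(energies, 0.9, 0.4))
```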

The second result I report describes a homogeneity property of $g$.
Proposition 3. $g(t\beta, t\lambda) = t\, g(\beta,\lambda)$ for any $t > 0$.
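Proof of Proposition 3. At first let us suppose $t > 1$. From the inequality, valid for
$x \ge 0$,

$$(1 + x)^t \ge 1 + x^t \tag{18}$$

and from the convexity of the function $x \to x^t$, $x \ge 0$, it follows that

$$\sum_{n=0}^{N} a_n^t \le \left(\sum_{n=0}^{N} a_n\right)^{t} \le (N+1)^{t-1} \sum_{n=0}^{N} a_n^t \tag{19}$$

for any integer $N$ and positive numbers $a_0, \ldots, a_N$. This chain of inequalities implies

$$\Xi_N^{t\beta,t\lambda}(\epsilon) \le \left[\Xi_N^{\beta,\lambda}(\epsilon)\right]^{t} \le (N+1)^{t-1}\, \Xi_N^{t\beta,t\lambda}(\epsilon). \tag{20}$$

Taking the logarithm, averaging and dividing by $N$, the extra term $(t-1)\log(N+1)/N$
vanishes in the limit $N \to \infty$, and then $g(t\beta,t\lambda) = t\,g(\beta,\lambda)$ when $t > 1$. Bearing in mind the latter point, the
substitution of $\beta$ with $\beta/t$ and $\lambda$ with $\lambda/t$ gives $g(\beta/t,\lambda/t) = g(\beta,\lambda)/t$, which proves the
proposition also for factors between 0 and 1, i.e. when $0 < t < 1$. The case $t = 1$ is trivial.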
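The chain of inequalities between $\Xi_N^{t\beta,t\lambda}$, the $t$-th power of $\Xi_N^{\beta,\lambda}$ and $(N+1)^{t-1}\,\Xi_N^{t\beta,t\lambda}$ holds for any fixed sequence of energies when $t > 1$, and can be verified numerically. This is only a sketch on an arbitrary sequence; the function names are hypothetical.

```python
import math
import random


def xi_sum(eps, beta, lam):
    """1 + sum_n exp(beta * (eps_1 + ... + eps_n) - lam * n)."""
    total, partial_sum = 1.0, 0.0
    for n, e in enumerate(eps, start=1):
        partial_sum += e
        total += math.exp(beta * partial_sum - lam * n)
    return total


def homogeneity_chain(eps, beta, lam, t):
    """Return (lower, middle, upper) with lower <= middle <= upper expected for t > 1."""
    N = len(eps)
    lower = xi_sum(eps, t * beta, t * lam)
    middle = xi_sum(eps, beta, lam) ** t
    upper = (N + 1) ** (t - 1) * lower
    return lower, middle, upper


if __name__ == "__main__":
    rng = random.Random(2)
    energies = [rng.gauss(0.0, 1.0) for _ in range(10)]  # arbitrary test sequence
    print(homogeneity_chain(energies, 0.8, 0.3, 1.5))
```

After taking logarithms and dividing by $N$, the gap between the two outer terms is $(t-1)\log(N+1)/N$, which is what makes the three quantities coincide in the thermodynamic limit.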

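Finally we can easily characterize $g$ in a region of the parameter space.
Proposition 4. $g(\beta,\lambda) = 0$ if $\lambda > \log\int_{\epsilon\in\mathcal{E}} e^{\beta\epsilon} P(d\epsilon)$.

Proof of Proposition 4. Making use of the concavity of the logarithm function, we
obtain

$$0 \le g_N(\beta,\lambda) \le \frac{1}{N}\log\int_{\epsilon\in\mathcal{E}^N} \Xi_N^{\beta,\lambda}(\epsilon)\,P_N(d\epsilon) = \frac{1}{N}\log\left\{1+\sum_{n=1}^{N}\left[e^{-\lambda}\int_{\epsilon\in\mathcal{E}} e^{\beta\epsilon}P(d\epsilon)\right]^{n}\right\}. \tag{21}$$

Then $g(\beta,\lambda) = 0$ if $e^{-\lambda}\int_{\epsilon\in\mathcal{E}} e^{\beta\epsilon}P(d\epsilon) < 1$ or equivalently $\lambda > \log\int_{\epsilon\in\mathcal{E}} e^{\beta\epsilon}P(d\epsilon)$.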

Exploiting these properties, it is now feasible to determine the form of the function $g$
on the whole parameter space. From propositions 3 and 4 it follows that, given $t$ larger
than 0, $g$ vanishes if $\lambda > \frac{1}{t}\log\int_{\epsilon\in\mathcal{E}} e^{t\beta\epsilon}P(d\epsilon)$. Taking the limit $t \to 0^+$, the right-hand
side tends to $\beta\mu$ and this condition
reduces to $\lambda > \beta\mu$. On the other hand, if $\lambda < \beta\mu$ then $-\lambda > -\beta\mu$ and proposition 2
tells us that $g(\beta,\lambda) = \beta\mu - \lambda$, due to the null value of $g(-\beta,-\lambda)$. Let us conclude by
collecting the previous results in a compact formula by means of the Heaviside function
$\theta$ ($\theta(x) = 1$ if $x > 0$ and 0 otherwise) and the function $\Theta$ defined as $\Theta(x) = x\,\theta(x)$. The following
holds

Theorem 1. $g(\beta,\lambda) = (\beta\mu - \lambda)\,\theta(\beta\mu - \lambda) = \Theta(\beta\mu - \lambda)$.
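Theorem 1 can be illustrated with a small Monte Carlo estimate of $g_N$ at large $N$. The sketch below assumes Gaussian contact energies with mean `mu` and standard deviation `sigma` (an assumption made only for this example; a Gaussian satisfies the exponential-moment condition), evaluates the logarithm of $\Xi_N^{\beta,\lambda}$ with the log-sum-exp trick to avoid overflow, and uses hypothetical function names.

```python
import math
import random


def log_xi(eps, beta, lam):
    """Logarithm of 1 + sum_n exp(beta * (eps_1+...+eps_n) - lam * n), computed stably."""
    exponents = [0.0]  # exponent of the leading term 1
    partial_sum = 0.0
    for n, e in enumerate(eps, start=1):
        partial_sum += e
        exponents.append(beta * partial_sum - lam * n)
    top = max(exponents)
    return top + math.log(sum(math.exp(x - top) for x in exponents))


def estimate_g(N, beta, lam, mu, sigma, samples, seed=0):
    """Monte Carlo estimate of g_N = E[log Xi_N] / N for Gaussian energies."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        eps = [rng.gauss(mu, sigma) for _ in range(N)]
        total += log_xi(eps, beta, lam) / N
    return total / samples


def g_theorem_1(beta, lam, mu):
    """Limit value stated in Theorem 1."""
    return max(beta * mu - lam, 0.0)


if __name__ == "__main__":
    for lam in (0.5, 1.5):
        print(lam, estimate_g(2000, 1.0, lam, 1.0, 0.5, 20), g_theorem_1(1.0, lam, 1.0))
```

The two values of `lam` in the usage example sit on opposite sides of $\beta\mu$, so the estimate should be close to $\beta\mu - \lambda$ in one case and close to zero in the other, up to finite-size and sampling errors.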



4. Self-averaging property 

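This section is devoted to the proof of the self-averaging property of the free-energy. In
order to quantify the fluctuations of the free-energy let us introduce the function $S_N$
defined as

$$S_N(\beta,\lambda) = \mathbb{E}\left[\left|\frac{1}{N}\log\Xi_N^{\beta,\lambda} - g(\beta,\lambda)\right|\right]. \tag{22}$$

As one can easily verify by means of Markov's inequality, given a positive number $\delta$, the probability of having a fluctuation
larger than or equal to $\delta$ is bounded by $S_N$:

$$\mathbb{P}\left[\left|\frac{1}{N}\log\Xi_N^{\beta,\lambda} - g(\beta,\lambda)\right| \ge \delta\right] \le \frac{S_N(\beta,\lambda)}{\delta}, \tag{23}$$

where the left-hand side is the usual shorthand for the $P_N$-measure of
the set of $\epsilon \in \mathcal{E}^N$ such that $\left|\frac{1}{N}\log\Xi_N^{\beta,\lambda}(\epsilon) - g(\beta,\lambda)\right| \ge \delta$, i.e. the complement of the set
where $g(\beta,\lambda) - \delta < \frac{1}{N}\log\Xi_N^{\beta,\lambda}(\epsilon) < g(\beta,\lambda) + \delta$.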

The self-averaging property of the free-energy is expressed by the fact that $S_N$
vanishes in the thermodynamic limit, as the following theorem states.
Theorem 2. $S(\beta,\lambda) = \lim_{N\to\infty} S_N(\beta,\lambda) = 0$ for any real numbers $\beta$ and $\lambda$.

In order to prove the theorem it is useful to extend to $S$ the reflection result about $g$.
Proposition 5. $S(\beta,\lambda) = S(-\beta,-\lambda)$.

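Proof of Proposition 5. From relation (17) and proposition 2 we have

$$\frac{1}{N}\log\Xi_N^{\beta,\lambda}(\epsilon_1,\ldots,\epsilon_N) - g(\beta,\lambda) = \frac{\beta}{N}\sum_{i=1}^{N}(\epsilon_i - \mu) + \frac{1}{N}\log\Xi_N^{-\beta,-\lambda}(\epsilon_N,\ldots,\epsilon_1) - g(-\beta,-\lambda), \tag{24}$$

which, passing to absolute values and averaging, yields

$$\left|S_N(\beta,\lambda) - S_N(-\beta,-\lambda)\right| \le |\beta| \int_{\epsilon\in\mathcal{E}^N} \left|\frac{1}{N}\sum_{i=1}^{N}(\epsilon_i - \mu)\right| P_N(d\epsilon). \tag{25}$$

Thanks to the Cauchy-Schwarz inequality we can go on and reach the result

$$\left|S_N(\beta,\lambda) - S_N(-\beta,-\lambda)\right| \le |\beta| \sqrt{\frac{1}{N}\int_{\epsilon\in\mathcal{E}} (\epsilon - \mu)^2\, P(d\epsilon)}. \tag{26}$$

The proof is concluded by considering the limit $N \to \infty$.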

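Now we can come back to the theorem.
Proof of Theorem 2. Remembering that $g(\beta,\lambda) = 0$ if $\beta\mu < \lambda$ and observing that
$\Xi_N^{\beta,\lambda}(\epsilon) \ge 1$, we have, when $\beta\mu < \lambda$,

$$S_N(\beta,\lambda) = \mathbb{E}\left[\left|\frac{1}{N}\log\Xi_N^{\beta,\lambda}\right|\right] = \mathbb{E}\left[\frac{1}{N}\log\Xi_N^{\beta,\lambda}\right] = g_N(\beta,\lambda) \tag{27}$$

and then $\lim_{N\to\infty} S_N(\beta,\lambda) = g(\beta,\lambda) = 0$. On the other hand, when $\beta\mu > \lambda$ we obtain
from proposition 5 that $S(\beta,\lambda) = S(-\beta,-\lambda) = 0$ since $-\beta\mu < -\lambda$.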
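The decay of the fluctuations with the system size can be observed in a simple simulation that estimates the average absolute deviation of $\frac{1}{N}\log\Xi_N^{\beta,\lambda}$ from its limit for two values of $N$. The sketch below assumes Gaussian contact energies (an assumption of this example only), takes the limit value from Theorem 1, and uses hypothetical function names.

```python
import math
import random


def log_xi(eps, beta, lam):
    """Logarithm of 1 + sum_n exp(beta * (eps_1+...+eps_n) - lam * n), computed stably."""
    exponents = [0.0]
    partial_sum = 0.0
    for n, e in enumerate(eps, start=1):
        partial_sum += e
        exponents.append(beta * partial_sum - lam * n)
    top = max(exponents)
    return top + math.log(sum(math.exp(x - top) for x in exponents))


def estimate_fluctuation(N, beta, lam, mu, sigma, samples, seed=0):
    """Monte Carlo estimate of E| log(Xi_N)/N - g | for Gaussian energies."""
    rng = random.Random(seed)
    g = max(beta * mu - lam, 0.0)  # limit value from Theorem 1
    total = 0.0
    for _ in range(samples):
        eps = [rng.gauss(mu, sigma) for _ in range(N)]
        total += abs(log_xi(eps, beta, lam) / N - g)
    return total / samples


if __name__ == "__main__":
    for size in (100, 2500):
        print(size, estimate_fluctuation(size, 1.0, 0.5, 1.0, 0.5, 100))
```

In the ordered phase chosen here ($\beta\mu > \lambda$) the dominant fluctuation comes from the sum of the energies, so the estimate is expected to shrink roughly as $N^{-1/2}$, in line with the bound obtained through the Cauchy-Schwarz inequality.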



5. Conclusions 

In the previous sections we focused on the function $g$, since its study was equivalent to
that of the free-energy $f$. Now we can come back to the expression (11) and, thanks to
theorem 1, write the final formula

$$f(\beta,q) = -\log(1+e^{-q}) - \frac{1}{2}\,\Theta\big(\beta\mu - 2\log(1+e^{q})\big). \tag{28}$$

The free-energy inherits the self-averaging property from $g$ and thus its behaviour is
completely characterized.

The continuous function $\Theta(x)$ has a discontinuity in the first derivative at $x = 0$,
showing that a first-order phase transition occurs at the critical value $\beta_c(q) = \frac{2}{\mu}\log(1+e^{q})$
of $\beta$. This critical point is associated with the transition between a disordered phase, the
unfolded state of the peptide, and an ordered one, the native state, pointing out a
two-state behaviour.

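The transition can be better characterized by means of an order parameter $\rho_N$, a
function of $\beta$ and $q$ measuring the degree of order in the system. We can choose
$\rho_N$ as the thermal and then quenched average of the fraction of native bonds. From
definitions (2), (3) and (4) it follows that $\rho_N = \partial f_N/\partial q$, so that in the thermodynamic
limit, away from the critical point,

$$\rho(\beta,q) = \frac{\partial f}{\partial q}(\beta,q) = \frac{1}{1+e^{q}} + \frac{e^{q}}{1+e^{q}}\,\theta\big(\beta\mu - 2\log(1+e^{q})\big). \tag{29}$$
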
At low temperature, $\beta > \beta_c(q)$, all the peptide bonds are ordered and the protein is
in its native state. The relationship between $\beta_c$ and the expectation $\mu$ of the contact energy
implies that no ordering can occur at physical temperature when the interaction is
repulsive on average ($\mu < 0$).
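The location of the transition and the jump of the order parameter can be evaluated with a few lines of code. The sketch below assumes an attractive average interaction ($\mu > 0$) and takes the order parameter to be the derivative of the limiting free-energy with respect to $q$ away from the critical point, which is an assumption of this example; the function names are hypothetical.

```python
import math


def critical_beta(mu, q):
    """Critical inverse temperature 2 * log(1 + e^q) / mu, for mu > 0."""
    return 2.0 * math.log(1.0 + math.exp(q)) / mu


def free_energy(beta, mu, q):
    """Limiting free-energy: -log(1 + e^{-q}) - Theta(beta*mu - 2*log(1 + e^q)) / 2."""
    x = beta * mu - 2.0 * math.log(1.0 + math.exp(q))
    return -math.log(1.0 + math.exp(-q)) - 0.5 * max(x, 0.0)


def order_parameter(beta, mu, q):
    """Derivative of free_energy with respect to q, away from the critical point."""
    ordered = 1.0 if beta * mu > 2.0 * math.log(1.0 + math.exp(q)) else 0.0
    return 1.0 / (1.0 + math.exp(q)) + ordered * math.exp(q) / (1.0 + math.exp(q))


if __name__ == "__main__":
    mu, q = 1.5, 0.4
    beta_c = critical_beta(mu, q)
    print(beta_c, order_parameter(0.9 * beta_c, mu, q), order_parameter(1.1 * beta_c, mu, q))
```

Below the critical value of $\beta$ the fraction of native bonds is the one of independent bonds with entropic cost $q$, while above it the fraction equals one.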

Let us observe lastly that the free-energy is the same as in a model with no disorder
and contact energies fixed at the value $\mu$. This means that the quenched disorder does
not affect the critical behaviour and the transition remains sharp and of the first order,
as in the pure case. This feature could not be considered manifest a priori, since, as
far as I know, no general result is available for models with long-range and many-body
interactions in the presence of quenched disorder.

Concluding, in this paper I have studied and solved exactly a simple disordered
model, deriving first the mathematical expression of the quenched free-energy and
then characterising completely the distribution of the free-energy by proving its
self-averaging property. The replica trick has been avoided since a more straightforward way
has been found to reach the desired results. I believe these might turn out to be helpful as
a benchmark for testing methods from disordered system theory, where exact solutions
are quite rare.